CHAPTER 1 Biostatistics 101 11
Comparing groups
In Part 4, we show you different ways to compare groups statistically.
» In Chapter 11, you see how to compare average values between two or
more groups by using t tests and ANOVAs. We also describe their nonparametric
counterparts that can be used with skewed or other non-normally
distributed data.
» Chapter 12 shows how to compare proportions between two or more groups,
such as the proportions of patients responding to two different drugs, using
the chi-square and Fisher Exact tests on cross-tabulated (cross-tab) data.
» Chapter 13 focuses on one specific kind of cross-tab called the fourfold table,
which has exactly two rows and two columns. Because the fourfold table
provides the opportunity for some particularly insightful calculations, it’s
worth a chapter of its own.
» In Chapter 14, you discover how the terminology used in epidemiologic
studies is applied to specifically formatted fourfold tables to calculate
incidence and prevalence rates.
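To make the idea of a fourfold table concrete, here is a minimal sketch in Python. The cell counts are invented for illustration (two hypothetical drugs, 50 patients each), and the function name fourfold_summary is our own, not something from a statistics library. It computes the response proportion in each row, the risk ratio, the odds ratio, and the Pearson chi-square statistic without a continuity correction.

```python
# Hypothetical fourfold (2x2) table; the counts are made up for illustration.
#
#              responded   did not respond
#   Drug A        30             20
#   Drug B        18             32
a, b = 30, 20
c, d = 18, 32


def fourfold_summary(a, b, c, d):
    """Summarize a 2x2 table whose rows are (a, b) and (c, d)."""
    n = a + b + c + d
    proportion_row1 = a / (a + b)  # proportion responding in row 1
    proportion_row2 = c / (c + d)  # proportion responding in row 2
    risk_ratio = proportion_row1 / proportion_row2
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table (shortcut formula, no continuity correction)
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return {
        "proportion_row1": proportion_row1,
        "proportion_row2": proportion_row2,
        "risk_ratio": risk_ratio,
        "odds_ratio": odds_ratio,
        "chi_square": chi_square,
    }


summary = fourfold_summary(a, b, c, d)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

To turn the chi-square statistic into a p value, you would compare it against a chi-square distribution with one degree of freedom. With small cell counts, the Fisher Exact test is usually the safer choice.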
Looking for relationships between variables
Epidemiology and biostatistics are interested in causal inference, which means trying
to figure out what causes particular outcomes in biological research. While it
is possible to look at the relationship between two variables in a bivariate analysis,
regression analysis is the part of statistics that enables you to explore the
relationship between multiple variables and one outcome in the same model so you
can evaluate the relative contribution of each variable to the outcome. Here are
some use cases for regression:
» You may want to know whether there’s a statistically significant association
between one or more variables and an outcome, even after accounting for other
variables in the model. You may ask: Does being overweight increase the
likelihood of getting liver cancer? Or: Is exercising fewer hours per week
associated with higher blood pressure measurements? In answering both
of those questions, you may want to control for other variables known to
influence the outcome.
» You may want to develop a formula for predicting the value of a variable from
the observed values of one or more other variables. For example, you may
want to predict how long a newly diagnosed cancer patient may survive
based on their age, obesity status, and medical history.
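Both use cases can be sketched with a small multiple regression. The Python example below is an illustration under stated assumptions, not a recommended analysis workflow: the data are invented (and deliberately noise-free so the arithmetic is easy to follow), the helper fit_ols is our own name, and it fits ordinary least squares by solving the normal equations directly. It models systolic blood pressure from weekly exercise hours while controlling for age, then uses the fitted formula to predict a value for a new person.

```python
def fit_ols(rows, y):
    """Least-squares fit of y on the predictors in rows, plus an intercept.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]  # prepend a column of 1s for the intercept
    k = len(X[0])
    # Augmented matrix for the normal equations (X'X) b = X'y
    A = [
        [sum(xi[i] * xi[j] for xi in X) for j in range(k)]
        + [sum(xi[i] * yi for xi, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical data: (exercise hours per week, age in years) for eight people
predictors = [(0, 60), (1, 45), (2, 50), (3, 30), (4, 55), (5, 35), (6, 40), (7, 25)]
# Hypothetical systolic blood pressure (mmHg), constructed to be exactly linear
systolic_bp = [140.0, 130.5, 131.0, 119.0, 129.5, 117.5, 118.0, 108.5]

coefficients = fit_ols(predictors, systolic_bp)
intercept, per_exercise_hour, per_year_of_age = coefficients
print(f"intercept: {intercept:.2f}")
print(f"change per exercise hour, holding age fixed: {per_exercise_hour:.2f}")
print(f"change per year of age, holding exercise fixed: {per_year_of_age:.2f}")

# Prediction for a new (hypothetical) 50-year-old who exercises 3 hours per week
predicted = intercept + per_exercise_hour * 3 + per_year_of_age * 50
print(f"predicted systolic blood pressure: {predicted:.1f}")
```

The coefficient for exercise hours is the estimated association with blood pressure after adjusting for age, which is the first use case. Plugging new values into the fitted formula is the second. Real data include noise, so a real analysis would also report standard errors and p values, which this sketch leaves out.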